In the claims:

1. (Currently Amended) A voice recognition system comprising a spectrum
converter for elongating or contracting the spectrum of a voice signal on the frequency axis,
the spectrum converter including: 

an analyzer for converting an input voice signal to an input pattern including 
cepstrum; 

a reference pattern memory with reference patterns stored therein; 

an elongation/contraction estimating unit for outputting an elongation/contraction
parameter in the frequency axis direction by using the input pattern and the reference patterns; 
and 

a converter for converting the input pattern by using the elongation/contraction 
parameter; 

wherein said elongating or contracting of the spectrum of the voice signal is carried
out using an expansion-compression coefficient obtained by retrieval in two dimensional
space such that one value of the coefficient is obtained for each utterance.

2. (Currently Amended) A voice recognition system comprising:

an analyzer for converting an input voice signal to an input pattern including a 
cepstrum; 

a reference pattern memory for storing reference patterns; 

an elongation/contraction estimating unit for outputting an elongation/contraction 
parameter in the frequency axis direction by using the input pattern and reference patterns; 

a converter for converting the input pattern by using the elongation/contraction 
parameter; and 

a matching unit for computing the distances between the elongated or contracted input
pattern fed out from the converter and the reference patterns and outputting the reference 
pattern corresponding to the shortest distance as result of recognition; 

wherein said elongation/contraction parameter is based on an expansion-compression 
coefficient obtained by retrieval in two dimensional space such that one value of the 
coefficient is obtained for each utterance.

3. (Currently Amended) The voice recognition system according to claim 1, 
wherein the converter executes the elongation or contraction of the spectrum on the frequency 
axis with a warping function defining the form of elongation or contraction by carrying out
the elongation or contraction in cepstrum space. 

4. (Currently Amended) The voice recognition system according to claim 1, 
wherein the elongation/contraction estimating unit executes the elongation or contraction of 
the spectrum on the frequency axis with a warping function defining the form of elongation or
contraction by using estimation derived from the best likelihood estimation of HMM (hidden
Markov model) in a cepstrum space.

5. (Currently Amended) A reference pattern learning system comprising: 

a learning voice memory with learning voice data stored therein; 

an analyzer for receiving a learning voice signal from the learning voice memory and 
converting the learning voice signal to an input pattern including cepstrum; 

a reference pattern memory with reference patterns stored therein; 

an elongation/contraction estimating unit for outputting an elongation/contraction 
parameter in the frequency axis direction by using the input pattern and the reference patterns;

a converter for converting the input pattern by using the elongation/contraction
pattern;

a reference pattern estimating unit for updating the reference patterns stored in the 
reference pattern memory for the learning voice data by using the elongated or contracted 
input pattern fed out from the converter and the reference patterns; and 

a likelihood judging unit for monitoring distance changes by computing distances by 
using the elongated or contracted input pattern fed out from the converter and the reference 
patterns; 

wherein said elongation/contraction parameter is based on an expansion-compression 
coefficient obtained by retrieval in two dimensional space such that one value of the
coefficient is obtained for each utterance.

6. (Currently Amended) The reference pattern learning system according to 
claim 5, wherein the converter executes the elongation or contraction of spectrum on the 
frequency axis with a warping function defining the form of elongation or contraction by
carrying out the elongation or contraction in cepstrum space. 

7. (Currently Amended) The reference pattern learning system according to 
claim 5, wherein the elongation/contraction estimating unit executes the elongation or 
contraction of spectrum on the frequency axis with a warping function defining the form of
elongation or contraction by using estimation derived from the best likelihood estimation of
HMM (hidden Markov model) in cepstrum space.

an analyzer for converting an input voice signal to an input pattern including a
cepstrum; 

a reference pattern memory for storing reference patterns; 

an elongation/contraction estimating unit for outputting an elongation/contraction 
parameter in the frequency axis direction by using the input pattern and reference patterns; 

a converter for converting the input pattern by using the elongation/contraction 
parameter; and 

an inverse converter for outputting a signal waveform in time domain by inversely 
converting the time serial input pattern obtained after the elongation/contraction supplied 
from the converter 

wherein said elongation/contraction parameter is based on an expansion-compression 
coefficient obtained by retrieval in two dimensional space such that one value of the 
coefficient is obtained for each utterance.

9. (Currently Amended) A recording medium for a computer constituting a 
spectrum converter by executing elongation or contraction of the spectrum of a voice signal 
on frequency axis, in which is stored a program for executing the following processes: 

(a) an analyzing process for converting an input voice signal to an input pattern 
including cepstrum, 

(b) an elongation/contraction estimating process for outputting an 
elongation/contraction parameter in frequency axis direction by using the input pattern and 
reference patterns stored in a reference pattern memory; and 

(c) a converting process for converting the input pattern by using the
elongation/contraction parameter

wherein said elongation/contraction parameter is based on an expansion-compression 
coefficient obtained by retrieval in two dimensional space such that one value of the 
coefficient is obtained for each utterance. 

10. (Currently Amended) A recording medium for a computer constituting a
system for voice recognition by executing elongation or contraction of the spectrum of a
voice signal on a frequency axis, in which is stored a program for executing the following
processes: 

(a) an analyzing process for converting an input voice signal to an input pattern 
including cepstrum, 

(b) an elongation/contraction estimating process for outputting an 
elongation/contraction parameter along the frequency axis by using the input
pattern and reference patterns stored in a reference pattern memory; 

(c) a converting process for converting the input pattern by using the 
elongation/contraction parameter; and 

(d) a matching process for computing the distances between the elongated or
contracted input pattern and the reference patterns and outputting the reference pattern
corresponding to the shortest distance as result of recognition

wherein said elongation/contraction parameter is based on an expansion-compression 
coefficient obtained by retrieval in two dimensional space such that one value of the 
coefficient is obtained for each utterance.

11. (Currently Amended) The recording medium according to claim 10, wherein 
the converting process executes the elongation or contraction of spectrum on the frequency 
axis with a warping function defining the form of elongation or contraction by carrying out
the elongation or contraction in cepstrum space. 



Atty. Dkt. No. 071671-0156 

12. (Currently Amended) The recording medium according to claim 10, wherein 
the elongation/contraction estimating process executes the elongation or contraction of the 
spectrum on the frequency axis with a warping function defining the form of elongation or
contraction by using estimation derived from the best likelihood estimation of HMM (hidden
Markov model) in cepstrum space.

13. (Currently Amended) In a computer constituting a system for learning 
reference patterns from learning voice data, a recording medium, in which is stored a 
program, for executing the following processes: 

(a) an analyzing process for receiving learning voice data from learning voice 
memory with learning voice data stored therein and converting the received learning voice 
data to an input pattern including cepstrum; 

(b) an elongation/contraction estimating process for outputting an 
elongation/contraction parameter along a frequency axis direction by using the input
pattern and the reference patterns stored in the reference pattern memory; 

(c) a converting process for converting the input pattern by using the 
elongation/contraction parameter; 

(d) a reference pattern estimating process for updating the reference patterns for 
the learning voice data by using the elongated or contracted pattern fed out in the converting 
process and the reference patterns; and

(e) a likelihood judging process for calculating the distances between the 
elongated or contracted input pattern after conversion in the converting process and the 
reference patterns and monitoring changes in distance 

wherein said elongation/contraction parameter is based on an expansion-compression
coefficient obtained by retrieval in two dimensional space such that one value of the
coefficient is obtained for each utterance.

14. (Currently Amended) The recording medium according to claim 13, wherein 
the converting process executes the elongation or contraction of the spectrum on the 
frequency axis with a warping function defining the form of elongation or contraction by
carrying out the elongation or contraction in cepstrum space. 

15. (Currently Amended) The recording medium according to claim 13, wherein 
the elongation/contraction estimating process executes the elongation or contraction of the 
spectrum on the frequency axis with a warping function defining the form of elongation or
contraction by using estimation derived from the best likelihood estimation of HMM (hidden
Markov model) in cepstrum space.

16. (Currently Amended) A recording medium for a computer constituting a 
spectrum conversion by executing elongation or contraction of the spectrum of a voice signal 
on a frequency axis, in which is stored a program for executing the following processes:

(a) an analyzing process for converting an input voice signal to an input pattern 
including cepstrum, 

(b) an elongation/contraction estimating process for outputting an 
elongation/contraction parameter along the frequency axis by using the input
pattern and reference patterns stored in a reference pattern memory; 

(c) a converting process for converting the input pattern by using the 
elongation/contraction parameter; and 

(d) an inverse converting process for outputting a signal waveform in time domain
by inversely converting the time serial input pattern obtained after the elongation/contraction 
supplied from the converter 

wherein said elongation/contraction parameter is based on an expansion-compression 
coefficient obtained by retrieval in two dimensional space such that one value of the 
coefficient is obtained for each utterance.

17. (Currently Amended) A spectrum converting method for elongating or 
contracting the spectrum of a voice signal on the frequency axis, comprising:

a first step for converting an input voice signal to an input pattern including cepstrum; 

a second step for outputting an elongation/contraction parameter in the frequency axis 
direction by using the input pattern and the reference patterns stored in a reference pattern 
memory; and 

a third step for converting the input pattern by using the elongation/contraction 
parameter 

wherein said elongation/contraction parameter is based on an expansion-compression 
coefficient obtained by retrieval in two dimensional space such that one value of the 
coefficient is obtained for each utterance.

18. (Currently Amended) A voice recognition method comprising: 

a first step for converting an input voice signal to an input pattern including a 
cepstrum; 

a second step for outputting an elongation/contraction parameter along a
frequency axis by using the input pattern and reference patterns stored in a reference
pattern memory; 

a third step for converting the input pattern by using the elongation/contraction 
parameter; and 

a fourth step for computing the distances between the elongated or contracted input
pattern and the reference patterns and outputting the reference pattern corresponding to the
shortest distance as result of recognition

wherein said elongation/contraction parameter is based on an expansion-compression 
coefficient obtained by retrieval in two dimensional space such that one value of the 
coefficient is obtained for each utterance.

19. (Currently Amended) The voice recognition method according to claim 17, 
wherein the elongation or contraction of the spectrum on the frequency axis with a warping
function defining the form of elongation or contraction is executed by carrying out the 
elongation or contraction in cepstrum space. 

20. (Currently Amended) The voice recognition method according to claim 17,
wherein the elongation/contraction estimating process executes the elongation or contraction
of the spectrum on the frequency axis with a warping function defining the form of elongation
or contraction by using estimation derived from the best likelihood estimation of HMM
(hidden Markov model) in cepstrum space.

21. (Currently Amended) A reference pattern learning method comprising: 

a first step for receiving a learning voice signal from the learning voice memory and 
converting the learning voice signal to an input pattern including cepstrum; 

a second step for outputting an elongation/contraction parameter along a frequency
axis by using the input pattern and the reference patterns stored in a reference
pattern memory; 

a third step for converting the input pattern by using the elongation/contraction
pattern;

a fourth step for updating the reference patterns for the learning voice data by using 
the elongated or contracted input pattern and the reference patterns; and 

a fifth step for monitoring distance changes by computing distances by using the 
elongated or contracted input pattern and the reference patterns 

wherein said elongation/contraction parameter is based on an expansion-compression 
coefficient obtained by retrieval in two dimensional space such that one value of the 
coefficient is obtained for each utterance.

22. (Currently Amended) The reference pattern learning method according to 
claim 21, wherein the third step executes the elongation or contraction of the spectrum on the 
frequency axis with a warping function defining the form of elongation or contraction by
carrying out the elongation or contraction in cepstrum space. 

23. (Currently Amended) The reference pattern learning method according to 
claim 21, wherein the second step executes the elongation or contraction of the spectrum on 
the frequency axis with a warping function defining the form of elongation or contraction by
using estimation derived from the best likelihood estimation of HMM (hidden Markov
model) in cepstrum space.

24. (Currently Amended) A voice recognition method of spectrum conversion to 
convert the spectrum of a voice signal by executing elongation or contraction of the
spectrum on a frequency axis, wherein:

the elongation or contraction of the spectrum of the input voice signal is as
defined by a warping function and is executed on cepstrum, and the extent of elongation or 
contraction of the spectrum on the frequency axis is determined with an
elongation/contraction parameter included in the warping function, and an optimum value is 
determined as elongation/contraction parameter value for each speaker 

wherein said elongation/contraction parameter is based on an expansion-compression 
coefficient obtained by retrieval in two dimensional space such that one value of the
coefficient is obtained for each utterance.

--25. (Currently Amended) The voice recognition system according to claim 2, 
wherein the converter executes the elongation or contraction of the spectrum on the frequency 
axis with a warping function defining the form of elongation or contraction by carrying out
the elongation or contraction in cepstrum space. 

26. (Currently Amended) The voice recognition system according to claim 2, 
wherein the elongation/contraction estimating unit executes the elongation or contraction of 
the spectrum on the frequency axis with a warping function defining the form of elongation or
contraction by using estimation derived from the best likelihood estimation of HMM (hidden
Markov model) in cepstrum space.

27. (Currently Amended) The voice recognition system according to claim 3, 
wherein the elongation/contraction estimating unit executes the elongation or contraction of 
the spectrum on the frequency axis with a warping function defining the form of elongation or
contraction by using estimation derived from the best likelihood estimation of HMM (hidden
Markov model) in cepstrum space.

28. (Currently Amended) The reference pattern learning system according to 
claim 6, wherein the elongation/contraction estimating unit executes the elongation or 
contraction of the spectrum on the frequency axis with a warping function defining the form
of elongation or contraction by using estimation derived from the best likelihood estimation
of HMM (hidden Markov model) in cepstrum space.

29. (Currently Amended) The voice recognition method according to claim 18, 
wherein the elongation or contraction of the spectrum on the frequency axis with a warping
function defining the form of elongation or contraction is executed by carrying out the
elongation or contraction in cepstrum space. 

30. (Currently Amended) The voice recognition method according to claim 18, 
wherein the elongation/contraction estimating process executes the elongation or contraction 
of the spectrum on the frequency axis with a warping function defining the form of elongation
or contraction by using estimation derived from the best likelihood estimation of HMM
(hidden Markov model) in cepstrum space.

31. (Currently Amended) The voice recognition method according to claim 19, 
wherein the elongation/contraction estimating process executes the elongation or contraction 
of the spectrum on the frequency axis with a warping function defining the form of elongation
or contraction by using estimation derived from the best likelihood estimation of HMM
(hidden Markov model) in cepstrum space.--